Mutations in the HBV PreS/S gene related to hepatocellular carcinoma in Vietnamese chronic HBV-infected patients

Background Chronic hepatitis B virus (HBV) infection (CHB) is a major health problem and a leading cause of hepatocellular carcinoma (HCC) worldwide. Several point and deletion mutations on the PreS/S gene have been widely investigated for their association with HCC. This study aimed to describe the characteristics of HBV PreS/S mutations in Vietnamese CHB patients and their association with HCC.
Methods This cross-sectional study was conducted from 02/2020 to 03/2021 and recruited Vietnamese CHB patients with HBV DNA >3 log10 copies/mL and successful PreS/S gene sequencing. Mutations were detected by direct Sanger sequencing.
Results A total of 247 CHB patients were recruited: 68.8% were male, 54.7% had HBV genotype B, 57.5% were HBeAg positive, 23.1% had a fibrosis score ≥F3 and 19.8% had HCC. Amino acid replacements were detected at 61.8% of sites across the PreS1/PreS2/S genes. The most common point-mutations were N/H51Y/T/S/Q/P (30.4%), V68T/S/I (44.9%) and T/N87S/T/P (46.2%) on the PreS1 gene; T125S/N/P (30.8%) and I150T (42.5%) on the PreS2 gene; and S53L (37.7%), A184V/G (39.3%) and S210K/N/R/S (39.3%) on the S gene. The rates of cases with any point-mutation on the major hydrophilic region (MHR) and the "a" determinant region were 63.6% and 39.7%, respectively. Most S point-mutations occurred at low rates, such as T47A/E/V/K (9.3%), P120S/T (8.5%) and G145R (2%). On multivariable analysis, male sex (OR = 4.51, 95% CI 1.78–11.4, p = 0.001), age ≥40 years (OR = 5.5, 95% CI 2.06–14.68, p = 0.001), W4P/R/Y on PreS1 (OR = 11.56, 95% CI 1.99–67.05, p = 0.006) and 4 S point-mutations, T47A/E/V/K (OR = 3.67, 95% CI 1.19–11.29, p = 0.023), P120S/T (OR = 3.38, 95% CI 1.09–10.49, p = 0.035), S174N (OR = 29.73, 95% CI 2.12–417.07, p = 0.012) and P203R (OR = 8.45, 95% CI 1.43–50.06, p = 0.019), were associated with HCC.
Conclusions We detected amino acid changes at 61% of sites on the PreS/S regions in Vietnamese CHB patients. One point-mutation at amino acid 4 on the PreS1 gene and 4 point-mutations at amino acids 47, 120, 174 and 203 on the S gene were associated with HCC. Further investigations are recommended to clarify the relationship and interaction between mutations in the HBV genome and HCC progression.


Introduction
Chronic HBV (CHB) infection affected 296 million people worldwide in 2019 [1] and is considered a major global health problem. CHB infection is the leading cause of liver cirrhosis and accounts for over 50% of hepatocellular carcinoma (HCC) cases [2]. Vietnam is located in the high-HBV-prevalence area of Asia [3], which has a high incidence of HBV-related HCC [2]. It was reported that 62.3% of HCC cases and 81.3% of advanced HCC cases in Vietnam were HBV infected [4,5].
HBV belongs to the Hepadnaviridae family and has a partially double-stranded DNA genome that carries 4 overlapping open reading frames coding for 4 proteins: PreS/S, pre-Core/Core, Pol and X. The HBsAg proteins are expressed from the PreS/S open reading frame and comprise 3 surface proteins: Small (S), Medium (M) and Large (L). The S protein, which drives the release of viral particles, consists of 227 amino acids (aa). The M protein, which enhances virion secretion, contains an extra N-terminal extension of 55 aa. The L protein, which is involved in the interaction with core particles during virion packaging at the endoplasmic reticulum (ER), has a further N-terminal extension of 108 or 119 aa, depending on genotype [6]. In the absence of L protein, S protein is secreted alone as noninfectious subviral particles. L protein can suppress subviral particle secretion, depending on the L/S protein ratio. During natural HBV infection, subviral particles outnumber virions by a factor of 1000:1. The principal epitopes of HBsAg are mainly located in the "a" determinant (aa 124-147) of the major hydrophilic region (MHR), which induces neutralizing B cell responses.
Mutations in the PreS/S gene result in deletion of surface proteins or synthesis of a variety of truncated proteins. In particular, mutations in the C-terminal region (aa 179-226) of the S gene contribute to retention of HBsAg within the hepatocyte ER [7,8], activate multiple oncogenic signaling pathways, promote hepatocyte growth and eventually lead to HCC development. Multiple lines of evidence have identified PreS mutations as predictive markers for HCC development and for recurrence of HBV-related HCC [9][10][11][12]. The relationship between PreS/S gene mutations and HCC has been studied specifically in the PreS region [13]. The T53C mutation, PreS deletions, PreS2 start codon mutations and the C7A, A2962G, C2964A and C3116T mutations in the PreS region have been shown to significantly increase the risk of HCC [14,15]. Wang et al. (2006) concluded that the retention of L antigens from PreS mutants caused ER stress, induced oxidative DNA damage and resulted in genomic instability. The L antigens from PreS mutants have been linked to upregulation of cyclooxygenase-2 and cyclin A and to promotion of the cell cycle and hepatocyte proliferation [16].
Mutations in the S gene in Vietnamese CHB patients were first described in 2012. Dunford et al. (n = 187) reported that 31% of cases had a point-mutation in the immunodominant 'a' region; in particular, two major vaccine escape mutations were detected at low rates, G145A/R (2.2%) and P120L/Q/S/T (5.3%) [17]. Matsuo et al. (2017) reported PreS deletion at a rate of 20% in Vietnamese CHB patients [18]. Bui TTT et al. (2017) described point-mutations in more detail (N38E 71.9%, N38K 71.1%, A60V/E 100% on the PreS1 region, L126T/S 77% on the PreS2 and N3S 27.4% on the S region) [19]. In a multicenter study of 660 patients from China, Korea and Vietnam, Kim et al. (2018) [20] reported 237 amino acid mutations in the MHR of the S region. However, there has been no report on PreS/S gene mutations and their association with HCC in Vietnamese patients.
In this study, we described the characteristics of HBV PreS/S gene mutations in Vietnamese patients with CHB and analyzed the relationship of these mutations with HCC.

Study design and population
This cross-sectional study was conducted at the Hepatology Clinic of the University Medical Center (UMC), Ho Chi Minh City, from February 2020 to March 2021. A total of 293 male and female CHB patients participated in the study. They met the inclusion criteria of being older than 18 years, having been HBsAg positive for more than 6 months, having had no previous nucleos(t)ide analogue (NA) treatment and having HBV DNA >3 log10 copies/mL. Serum samples were separated from 4 mL of blood and stored at -80˚C for PreS/S gene sequencing. Serum samples from the 247 patients whose HBV PreS/S gene was successfully sequenced were analyzed for the final results. Among them, 212 CHB patients had serum samples that had been stored during 2014-2016, and 35 CHB patients were recruited in 2020-2021.

Variables and measurements
Personal characteristics, time since diagnosis of CHB infection and HBV markers were collected from the hospital electronic database. HCC was defined as a tumor detected on abdominal ultrasound with serum alpha-fetoprotein (AFP) >20 ng/mL, confirmed on abdominal CT scan by focal lesions with early arterial phase enhancement and rapid "washout" in the venous phase [21]. Cirrhosis was defined as signs of portal hypertension (splenomegaly, ascites, vascular collaterals on the abdominal wall, esophageal varices or portal hypertensive gastropathy on gastroscopy) and signs of liver function insufficiency (palmar erythema, vascular spiders, low albumin concentration (<35 g/L), elevated international normalized ratio (INR >1.1), thrombocytopenia (<160,000/mm^3)), as well as irregularity of the hepatic surface on ultrasonography or F3 on the Metavir score using acoustic radiation force impulse (ARFI) measurement [22].
HBsAg quantification (Elecsys HBsAg II Quant kit, Roche), HBeAg testing (Cobas kit, Roche) and HBV DNA quantification (AccuPid HBV Quantification kit, KT-Biotech; limit of detection ≥300 copies/mL, linear range 300 to 10^8 copies/mL) were performed at the University Medical Center laboratory. HBV genotype (B or C) was determined based on the sequence of the S gene. PCR was used to amplify the whole PreS/S region of HBV with TaKaRa Taq™ HotStart Polymerase (Takara Bio, Shiga, Japan). Primers for the PreS1/PreS2 region were FA2-L (5'-TTGAGAGAAGTCCACCACGAG-3') and FA2-R (5'-GCGTCGCAGAAGATCTCAAT-3'); primers for the S region were FA3-L (5'-CTGCTGGTGGCTCCAGTT-3') and FA3-R (5'-GCCTTGTAAGTTGGCGAGAA-3'). PCR involved initial denaturation at 98˚C for 3 min followed by 45 cycles of 98˚C for 10 sec, 58˚C for 30 sec and 72˚C for 60 sec, with a final elongation at 72˚C for 2 min. PCR products were checked for size and purity using 1.5% agarose gel electrophoresis. The PCR product was purified enzymatically using the ExoSAP-IT™ PCR Product Cleanup Reagent (Thermo Scientific, Waltham, MA) to remove excess primers and dNTPs before Sanger sequencing using the BigDye Terminator v3.1 Kit and the ABI 3500 Genetic Analyzer (Applied Biosystems, Foster City, CA). Each PCR fragment was sequenced and analyzed in both directions. The sequences were compared to the reference sequences of genotype B (GenBank_AB073846) and genotype C (GenBank_X04615) using the CLC Main Workbench software (Qiagen, Germany).
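The comparison against a reference sequence can be illustrated with a minimal Python sketch of how substitutions such as W4P are named once a sample and its reference are aligned. This is not the CLC Main Workbench procedure used in the study; the function name `call_substitutions` and the toy peptides are hypothetical and were invented for illustration, and deletions would need separate handling.

```python
def call_substitutions(reference: str, sample: str, first_position: int = 1) -> list[str]:
    """Name amino acid substitutions between two pre-aligned, equal-length sequences."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    calls = []
    position = first_position - 1
    for ref_aa, obs_aa in zip(reference, sample):
        if ref_aa == "-":
            # Insertion relative to the reference: it has no reference position.
            continue
        position += 1
        if obs_aa in ("-", "X"):
            # Deleted or unresolved residue: not reported as a substitution here.
            continue
        if ref_aa != obs_aa:
            calls.append(f"{ref_aa}{position}{obs_aa}")
    return calls


# Toy peptides, invented for illustration (not real HBV sequences).
toy_reference = "MACWDEFGHK"
toy_sample = "MACPDEFGNK"
calls = call_substitutions(toy_reference, toy_sample)
print(calls)  # ['W4P', 'H9N']
```

Numbering follows the reference residues only, so an insertion in the sample does not shift the positions of later substitutions.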

Statistics
SPSS 25.0 software was used to analyze the data. Percentages were used to present the rates of point-mutations, the rates of possessing at least one mutation and the rates of gene deletion in each region. The Chi-square test (or Fisher's exact test) was used to compare the distributions of mutations between groups with and without HCC. Multivariable logistic regression was used to identify factors related to HCC. A two-sided p value of <0.05 was considered statistically significant.
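As a minimal sketch of how a logistic regression coefficient maps to an odds ratio (OR) and its 95% confidence interval, the Python snippet below assumes Wald-type intervals, that is, OR = exp(β) with limits exp(β ± 1.96·SE). The paper does not state how its intervals were computed, so this is an assumption. The function names are hypothetical, and the numeric input is the published S174N estimate, used only to show that the interval is symmetric on the log scale.

```python
import math

Z95 = 1.959964  # two-sided 95% quantile of the standard normal distribution


def odds_ratio_ci(beta: float, se: float, z: float = Z95) -> tuple[float, float, float]:
    """Return (OR, lower, upper) for a logistic coefficient and its standard error."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def recover_beta_se(odds_ratio: float, lower: float, upper: float, z: float = Z95) -> tuple[float, float]:
    """Back-calculate the coefficient and standard error from a reported OR and Wald CI."""
    beta = math.log(odds_ratio)
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return beta, se


# Published estimate for S174N: OR 29.73 (95% CI 2.12-417.07).
beta, se = recover_beta_se(29.73, 2.12, 417.07)
or_value, ci_low, ci_high = odds_ratio_ci(beta, se)
print(f"beta = {beta:.3f}, SE = {se:.3f}")
print(f"OR = {or_value:.2f} ({ci_low:.2f}-{ci_high:.2f})")
```

Under this assumption the round trip reproduces the reported limits to within rounding. The large back-calculated standard error reflects how few patients carried the mutation.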

Ethics considerations
The study was conducted in accordance with the Declaration of Helsinki. The use of stored serum samples and all variables included in the study was approved by the Ethics Committee of the University of Medicine and Pharmacy at Ho Chi Minh City (Reference number: 119/HDDD). Informed consent was obtained from all participants prior to the study.

Characteristics of the study population
The study included 247 CHB patients: 68.8% were male, 57.9% were aged 40 years or older, 57.5% were HBeAg positive, 83% had HBV DNA ≥5 log10 copies/mL and 54.7% had genotype B. Liver fibrosis was present in 23.1% and HCC in 19.8% (Table 1).
In the S region, amino acid changes were detected at 61.2% of sites (139/227). The rate of cases with at least one point-mutation on the "a" determinant region (aa 124-148) was 39.7% and on the MHR region (aa 100-160) was 63.6%. The HCC group had a significantly higher rate of possessing ≥1 point-mutation on the MHR region (79.6% vs 59.6%, p = 0.009).
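The MHR comparison above can be checked with a small Python sketch of a Pearson chi-square test on a 2x2 table. The cell counts are not given in the text; they are reconstructed here from the reported percentages, assuming 49 HCC and 198 non-HCC patients (19.8% of 247), with 39 and 118 carriers respectively. Using the uncorrected Pearson statistic is also an assumption, since the paper does not say whether a continuity correction was applied.

```python
import math


def pearson_chi_square_2x2(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    observed = [[a, b], [c, d]]
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected
    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Counts reconstructed from the reported percentages (assumption, not source data):
# HCC: 39 of 49 with an MHR mutation (79.6%); non-HCC: 118 of 198 (59.6%).
chi2, p_value = pearson_chi_square_2x2(39, 10, 118, 80)
print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}")
```

With these reconstructed counts the p value falls close to the reported 0.009, which suggests the assumed table is consistent with the published comparison.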

The point-mutations on the PreS1/PreS2/S genes related to HCC: multivariable analysis
Nineteen point-mutations that were distributed differently (p<0.1) between the HCC and non-HCC groups were entered into a multivariable analysis to adjust for their interactions (Tables 2-4). Six point-mutations remained related to HCC: one mutation on the PreS1 region, W4P/R/Y (OR = 5.48, 95%CI 1.32-22.83), and 5 mutations on the S region: F20S (OR = 9.72, Table 2

PLOS ONE
Mutations in the HBV PreS/S gene related to HCC


Discussion
To the best of our knowledge, this investigation is one of the first studies on PreS/S gene mutations and their relation to HCC in Vietnamese CHB patients. The study population included CHB patients with HBV DNA >3 log10 copies/mL (for a higher chance of mutation detection) and successful PreS/S sequencing (for better description of mutations and analysis of their correlation with HCC). The mutation rates presented in this study might therefore be higher than those in the overall HBV-infected population in Vietnam. Our study sample comprised 54.7% genotype B, 57.5% HBeAg positive, 23.1% liver fibrosis ≥F3, 83% HBV DNA >5 log10 copies/mL and, notably, 19.8% with HCC. These population characteristics not only represented variables that needed to be adjusted for their confounding effects but also served the aim of detecting mutations and their relationships with HCC. Amino acid replacements were detected at 61.8% of sites across the entire PreS/S gene. The higher rates of change on the PreS2 and S genes (74.5% of 228 and 70% of 55 amino acid sites, respectively, versus 56.7% of 120 amino acid sites on the PreS1) revealed the high variability of these regions.

point-mutations were mostly not related to HCC. In contrast, 4 point-mutations that belonged to the low-rate group were related to HCC: W4P/R/Y and S5L/T (p = 0.055) on the NTCP region, A90T/S/G on the HSP70 region and L108V/I on the S promoter and B cell epitopes (Table 2).
On the PreS2 region, which consists of 55 amino acids, changes were detected at 74.5% of amino acid sites, with mutation rates ranging from 0.4% to 42.5%. Most of the changes (34/41, 82.9%) belonged to the group with rates <5%. Only F141V/L/I had a higher distribution in the HCC group (18.4% vs 9.6%, p = 0.08) (Table 3). Our finding seems compatible with the report of Mun et al., who found that the F141L mutant strain increased the risk of HCC in subjects infected with HBV genotype C [23]. They also demonstrated enhanced cell cycling in F141L-expressing cell lines, with a doubling of colony-forming frequency versus the wild type.
Moreover, we often found lower rates of amino acid changes than other studies, such as that of Bui [20]. Conversely, we detected higher rates of S point-mutations such as L21S (29.1%), S53L (37.7%), A184V/G (39.3%), S204R/N (10%) and S210K/N/R/S (39.3%), and also of mutations on the "a" determinant (39.7% of cases with mutation, compared to 7% from Hudu's group [30]). Despite these lower and higher rates of amino acid change, none of these mutations was found to be related to HCC in our cross-sectional study. These differences in rates among studies could be explained not only by the distribution of genotypes but also by the variety of subgroups in the study populations (such as HBsAg-antiHBs co-existence status, nucleos(t)ide or immunoglobulin treatment, liver cirrhosis and HCC). Moreover, in our study population, antiHBs was tested in 186 cases with clinical symptoms, and HBsAg-antiHBs co-existence was detected in 37 cases (19.9%), higher than the rates of 3-5% in other investigations [32,33]. Therefore, it was presumed that CHB patients with a variety of presentations had been included in our study, which contributed to the difference in rates of point-mutations compared to other studies. We also found a significantly higher distribution of cases with mutation on the MHR region (p = 0.009) in the HCC group, especially a higher rate of P120S/T. Outside the MHR region, we also detected higher rates of 3 other S mutations in the HCC group: T47A/E/V/K, S174N (in the HLA II region) and P203S (in the HLA II region, the C-terminal domain) (Table 4). Hossini et al. (2019) had previously found a higher rate of P120T/S in the HCC with cirrhosis group [34]. Qiao et al. (2017) also reported that N-glycosylation mutations on the MHR region accompanied by HBsAg-antiHBs co-existence were related to HCC [35]. Liu et al. (2013) further stated that N-glycosylation of HBsAg modulates HBsAg secretion, causes ER stress and affects the cell cycle and cell proliferation [36].
Mutant strains with amino acid changes at T or B cell epitopes on the PreS region can escape immune surveillance, which prolongs HBV infection. Mutations at specific regions of the PreS/S genes may create premature stop codons, produce abnormal truncated proteins, unbalance the synthesis of surface proteins, cause retention of HBV inside host cells, promote the endoplasmic reticulum stress pathway, cause oxidative DNA damage and genomic instability, upregulate the cell cycle and lead to malignant transformation of hepatic cells [7,25,26,27]. The PreS/S mutant strains enhance cell cycle progression by down-regulating the p53 and p21 pathways, upregulate cyclin-dependent kinase 4 and cyclin A, hamper HBsAg secretion and increase cellular proliferation [8].
Other studies have raised concerns about the effects of mutant strains and their amino acid replacements on virion secretion defects (at amino acid 172 on the S gene) (Warner et al. [37]), on cell proliferation and transformation (at amino acids 95, 182 and 216 on the S gene) (Huang et al. 2014 [38]) and on predisposition to HCC development (at amino acids 69, 95, 182, 216 and 210 on the S gene) [8,38]. However, none of these point-mutations was found to be related to HCC in our study.
In a 2008 study of liver tissue from HCC patients, Hatazawa et al. detected 2 PreS mutations (W4R and A60V) and further PreS amino acid replacements at codons 5, 30, 35, 5, 54, 77, I84, 98, 102, 118, 123 and 124 [31]. Chen et al. (2008) also reported that W4P/R and other changes at codons 7 and 81 on the PreS1 and at codon 68 on the S region were related to HCC [39]. Later, significantly higher frequencies of 3 PreS mutations at codons 4, 60 and 125 in HCC patients were recorded by Yin et al. (2010) [40] and Zhang et al. [28]. Interestingly, in a longitudinal study, Zhang also observed increasing quasispecies complexity and diversity of HBV strains during progression to HCC, and noted in particular that the majority of these mutations existed at least 10 years before the development of HCC [41]. Zhang et al. (2017) also reported significantly higher rates of PreS deletion and other PreS mutations at codons 4, 27 and 167 in the HCC group [28].
On reviewing the literature on point-mutations related to HCC, we found large differences between studies in the patterns and characteristics of the amino acid changes related to HCC. These differences might originate from the structure of the study populations, HBV genotypes, the large spectrum of amino acid changes along the PreS1/PreS2/S sequences and the interactions between mutations.
Multivariable analysis was applied twice in our study: first, to adjust for interactions between the 19 point-mutations that had shown higher rates in the HCC group, which identified 6 mutations with higher risks of HCC (Table 5); second, to adjust for the confounding effects of personal and viral factors (Table 6). The final analysis identified 5 mutations (W4P/R/Y on the PreS1 region and T47A/E/V/K, P120S/T, S174N, P203R on the S region) that were significantly related to HCC. The findings for the first 3 mutations were in agreement with other published papers, whereas P203R has not been well reported. Salpini et al. (2017) stated that P203Q and the combination of P203R and S210R hampered HBsAg secretion and increased cellular proliferation. The correlation of the C-terminus mutations P203Q (17.4% vs 1.0%, p = 0.004) and S210R (34.8% vs 3.8%, p<0.001), and of their combination, with HCC has been reported in genotype A and D CHB patients [8].
Regarding the OR values in the final multivariable analysis, 2 S mutations, T47A/E/V/K (23 cases) and P120S/T (21 cases), showed a roughly threefold increase in HCC risk with reasonable confidence intervals. In contrast, the three remaining mutations were detected in only small numbers of cases and had especially high ORs with wide 95% confidence intervals: W4P/R/Y (preS1), 3 cases, OR 11.56 (1.99-67.05); S174N (S), 4 cases, OR 29.73 (2.12-417.07); and P203R (S), 8 cases, OR 8.45 (1.43-50.06) (Table 6). The small sample size of this study resulted in wider confidence intervals with a larger margin of error for these sporadic mutations. A tighter confidence interval with values closer to the actual OR would be expected if the sample size were increased. We calculated the predictive values of these 5 point-mutations and found high specificity (SPE) and negative predictive values (NPV) for all of them, but modest sensitivity (SEN) because of the small number of cases. S174N (S), for instance, had the highest OR in our study and relatively high PPV, NPV and specificity (75%, 81.1% and 99.5%, respectively), but its sensitivity was only 6.1% (S3 Table). Deep sequencing, with its higher sensitivity, could potentially increase detection rates and improve the SEN values of these low-frequency point-mutations. However, we were unable to perform it in this study because of its high cost. Further studies are recommended to build on the findings of this study, in which deep sequencing would be the most suitable technique for recognizing point mutations at the quasispecies level.
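The predictive values quoted for S174N can be reproduced with a short Python sketch. The 2x2 counts below are not stated in the text; they are an assumption reconstructed from the 4 S174N cases and from 49 HCC and 198 non-HCC patients (19.8% of 247), namely 3 carriers with HCC and 1 without. The function name `diagnostic_metrics` is hypothetical.

```python
def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict[str, float]:
    """Sensitivity, specificity, PPV and NPV (as percentages) from 2x2 counts."""
    return {
        "sensitivity": 100 * tp / (tp + fn),  # carriers among HCC patients
        "specificity": 100 * tn / (tn + fp),  # non-carriers among non-HCC patients
        "ppv": 100 * tp / (tp + fp),          # HCC among carriers
        "npv": 100 * tn / (tn + fn),          # non-HCC among non-carriers
    }


# Reconstructed counts for S174N (assumption, not source data):
# 3 carriers with HCC, 1 carrier without, 46 non-carriers with HCC, 197 without.
s174n = diagnostic_metrics(tp=3, fp=1, fn=46, tn=197)
for name, value in s174n.items():
    print(f"{name}: {value:.1f}%")
```

With these counts the four values round to 6.1%, 99.5%, 75.0% and 81.1%, matching the figures in the text. The sensitivity is capped by the small number of carriers: even if every carrier had HCC, 4 carriers among 49 HCC patients would give at most 8.2%.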
In contrast, the frequent amino acid replacements in our study were detected at widely known structural and functional sequences, such as N51Y/T/S/Q (30.4%) and V68T/S/I (44.9%) on the S promoter, T/N87S/T/P (46.2%) on the HSP70 (heat shock protein) region and T125S/N/P (30.8%) on the NBS region. These regions are known for their roles in the structure and morphology of HBV: the dual topology of L proteins (HSP70 and the CAD, cytosolic anchorage determinant), virion morphogenesis (the NBS, nucleocapsid binding site) and S RNA transcription (the S promoter and the CCAAT/CBF) [42]. At the cellular level, mutations at these functional regions are known to affect the production and secretion of surface proteins, the intracellular retention of envelope proteins and endoplasmic reticulum (ER) stress [43]. However, these point-mutations were equally distributed between the HCC and non-HCC groups in our study. Longitudinal cohort studies beyond this population are needed because disease progression and HCC outcomes take a decade or more to appear.
Some HCC-related factors were not included in the multivariable analysis, such as basal core promoter mutations, history of vaccination, HBsAg-antiHBs co-existence status, HBV genotype and cirrhotic status. In addition, combinations of mutations and their interactive effects have not yet been analyzed. Another limitation of our study was that the study population was not large enough for the low-rate mutations. The wide spectrum of significant mutations on the 3 regions (PreS1, PreS2 and S) and the interactive effects between mutations need to be clarified.
Further larger investigations and longitudinal observational studies are needed to describe and analyze the relation between PreS/S mutations and HCC.

Conclusions
Amino acid changes, with a broad range of mutation rates, were detected at 61% of sites on the PreS1/PreS2/S regions in chronic HBV-infected patients. W4P/R/Y (on the preS1 region) and T47A/E/V/K, P120S/T, S174N and P203R (on the S region) were found to be related to HCC. Further investigations, including cohort studies, are recommended to clarify the relation between mutations in the HBV genome and HCC outcome.
Supporting information
S1 Table. Distributions of 6 mutations related to HCC in groups of personal and HBV characteristics.